Abstract. Several mathematical ideas have been investigated for Quantitative
Information Flow. Information theory, probability and guessability are the
main ideas in most proposals. They aim to quantify, respectively, how much
information is leaked, how likely it is that the secret is guessed, and how
long it takes to guess the secret. In this paper, we show how the Lattice
of Information provides a valuable foundation for all these approaches: not
only does it provide an elegant algebraic framework for the ideas, but it
also allows us to investigate their relationship. In particular we will use
this lattice to prove some results establishing order relation
correspondences between the different quantitative approaches. The
implications of these results w.r.t. recent work in the community are also
investigated. While this work concentrates on the foundational importance
of the Lattice of Information, its practical relevance has recently been
proven, notably with the quantitative analysis of Linux kernel
vulnerabilities. Overall we believe these works make the case for
establishing the Lattice of Information as one of the main reference
structures for Quantitative Information Flow.



1. Introduction 

Quantitative security analysis should be able to address confidentiality^1
comparison questions like: "given programs P and P', which one is more of
a threat?" This comparison problem is related to the other fundamental
question that a quantitative security analysis should be able to address:
"how much of a threat is program P?"

Quantitative analyses are based on some measure, usually a real number.
This number may answer the comparison problem by reducing it to a
numerical comparison, and the second question by considering the magnitude
of the number in relation to the size of the secret. In many of these
measures the number 0 has been shown to characterise secure programs.

In recent years a number of ideas have emerged as reasonable measures
for Quantitative Information Flow (abbreviated as QIF): Information
Theory, probabilistic measures and guessability [CHM2, CMS, KB, Sm]. The
information theoretical concepts of entropy, conditional entropy and mutual
information have been used to answer questions like "how much information
can an attacker gain from observing the system?", whereas probabilities can
be used to answer questions like "how likely is it that the attacker may
guess the secret in n tries after observing the system?", and guessability
addresses the question "what is the number of guesses needed to guess the
secret after the observations?"

^1 In this work we restrict ourselves to security as confidentiality.

There seems to be an intuitive connection between these questions, but
the connection is not trivial; in fact some deep differences have been
noticed in these approaches [Sm]. In the context of QIF the differences
seem mainly to relate to the variety of attacker models and to what the
scope of modelling should be.

In this work we aim to relate the confidentiality comparison questions in
probabilistic, guessability and information theoretical approaches. We will
do this by studying their relation to an algebraic structure: the Lattice of
Information (abbreviated as LoI).

The Lattice of Information is the lattice of all equivalence relations on
a set. By identifying observations over a system with the equivalence
relation equating all (secret) states that cannot be distinguished by those
observations, we see LoI as the mathematical model for all observations
generated by all possible deterministic systems over a set of (secret)
states.

This allows for an elegant decomposition of QIF analysis into two steps,
the first being an algebraic interpretation, the second being a numerical
evaluation:

(1) interpret the attacker view of the system as an equivalence relation 
identifying the states indistinguishable by the attacker through the 
observations, 

(2) measure the above equivalence relation. This measure should provide 
an indication of the leakage of confidential information (or vulnera- 
bility) of the system. 

While these equivalence relations have been successfully used in recent
years [CHM2, M2, KB], we aim here to prove some fundamental results about
their algebraic structure.

Given two systems $S, S'$ and the associated equivalence relations
$\approx_S, \approx_{S'}$ we will show that the following are equivalent:

(1) $\approx_{S'}$ refines $\approx_S$

(2) the leakage of $S$ is always less than or equal to the leakage of
$S'$ (leakage measured by Shannon entropy)

(3) the expected probability of guessing the secret in n tries
according to $\approx_S$ is always less than or equal to the expected
probability of guessing the secret in n tries according to $\approx_{S'}$

(4) the expected number of guesses needed to guess the secret
according to $\approx_{S'}$ is always less than or equal to the expected
number of guesses needed to guess the secret according to $\approx_S$

In other terms, given two programs P, P', determining whether P' refines P
(as observational equivalence relations) is the same as determining whether
it is always the case that the secret is more likely to be guessed using P'
than using P. This is also the same as determining whether the entropy of P
is always less than the entropy of P'. Moreover these results are shown to
be consistent with different definitions of Quantitative Information Flow
based on the adversary gain through observations, i.e. the difference in
threat before and after observations are made [TA].

These results hence provide a clear connection between the algebraic, 
probabilistic and information theoretical view of leakage. 

The work also contributes to the foundations of Quantitative Information
Flow, in particular to the important work by G. Smith [Sm], where the
differences between the "one guess" model and the information theoretical
one were insightfully debated. What Smith noticed was that there exist
programs such that, assuming a uniform distribution of the secret, their
information theoretical measure is the same but their vulnerability to a
one guess attack is very different. In the argument it is important to
consider a specific (in this case uniform) distribution. It is debatable,
however, whether code analysis should be affected by an element independent
of the code, in this case the distribution. What our result shows is that
if we argue about the relative vulnerability of two programs to an n tries
attack, and the argument is not dependent on a specific distribution, then
their relative vulnerability is determined by their LoI order or,
equivalently, by their entropy order.

The algebraic aspect of QIF, i.e. the LoI interpretation of programs, is
far from being a purely academic exercise; in fact it has informed works
integrating QIF with verification techniques [BKR, HM1], where model
checkers and SAT solvers are used to build the equivalence $\approx_S$
associated to a program. More recently these ideas have been exploited to
build the first quantitative analysis for real code leakage, in particular
to quantify leakage of Linux kernel functions [HM1]. These works make use
of a basic relation between LoI and Information Theory, i.e. the fact that
$\log(|\approx_S|)$ is the channel capacity of the system S, i.e. the
maximum amount that S can leak.



2. Basics 

2.1. Observations and the lattice of information. We can see
observations over a system as some partial information on the system's
states, in that an observation reveals some information about the states of
the system. Some systems may allow for observations revealing no
information (all states are possible according to that system's
observations) while other systems may allow for observations revealing
complete information on the states of the system.

We will make an important determinacy assumption about observations,
i.e. that a system's observations form a partition on the set of all
possible states: a block in this partition is the set of states that are
indistinguishable by that observation. This assumption is satisfied for
example in the setting of sequential languages when we take as observations
the program outputs, because the inverse images of a function form a
partition of the function's domain.

In this work we will use the terms partition and equivalence relation
interchangeably. An equivalence relation can always be seen as the
partition whose blocks are the equivalence classes, and a partition can
always be seen as the equivalence relation under which two objects are
related iff they are in the same block.

2.2. Partitions and equivalence relations as lattice points. Given a
finite set S the set of all possible equivalence relations over S is a
complete lattice: the Lattice of Information (abbreviated as LoI) [LR]. The
order on equivalence relations is the refinement order.

Formally let us define the set LoI as the set of all possible equivalence
relations on a set S. Given $\approx, \sim \in LoI$ and
$\sigma_1, \sigma_2 \in S$ the ordering of LoI is defined as

(1) $\approx \sqsubseteq \sim \iff \forall \sigma_1, \sigma_2\ (\sigma_1 \sim \sigma_2 \Rightarrow \sigma_1 \approx \sigma_2)$

This is a refinement order: classes in $\sim$ refine (split) classes in
$\approx$. Thus, higher elements in the lattice can distinguish more states
while lower elements in the lattice can distinguish fewer states. It easily
follows from (1) that LoI is a complete lattice.

Alternatively the lattice operations join $\sqcup$ and meet $\sqcap$ are
defined as the intersection of relations and the transitive closure of the
union of relations, respectively.

The restriction to finite lattices is motivated by considering information
storable in program variables: the number of such values is at most $2^k$,
where k is the number of bits of the secret variable.

In terms of partitions, a partition is above another if it is more
informative, i.e. each block in the upper partition is included in a block
in the lower partition.
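
The refinement order and the two lattice operations can be illustrated with a small sketch. The Python fragment below is only an illustration under our own assumptions: partitions are represented as sets of frozensets, and the helper names refines, join and meet are hypothetical names chosen here, not part of any library.

```python
from itertools import product


def refines(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`,
    i.e. `coarse` is below (or equal to) `fine` in the lattice."""
    return all(any(block <= c for c in coarse) for block in fine)


def join(p, q):
    """Least upper bound: all non-empty intersections of a block of p
    with a block of q (intersection of the two relations)."""
    return {a & b for a, b in product(p, q) if a & b}


def meet(p, q):
    """Greatest lower bound: merge overlapping blocks until none overlap
    (transitive closure of the union of the two relations)."""
    merged = []
    for block in list(p) + list(q):
        current = set(block)
        untouched = []
        for m in merged:
            if m & current:
                current |= m  # overlapping blocks collapse into one
            else:
                untouched.append(m)
        merged = untouched + [current]
    return {frozenset(m) for m in merged}


A = {frozenset({1, 2}), frozenset({3, 4})}
B = {frozenset({1, 3}), frozenset({2, 4})}

print(sorted(sorted(b) for b in join(A, B)))  # four singleton blocks
print(sorted(sorted(b) for b in meet(A, B)))  # one block with all states
```

Here A and B are incomparable: neither refines the other, their join separates every state and their meet identifies all of them.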

Here is an example of how these equivalence relations can be used in an
information flow setting. Let us assume the set of states S consists of
tuples $(l, h)$ where $l$ is an observable variable, usually called low, and
$h$ is a confidential variable, usually called high. One possible observer
can be described by the equivalence relation

$(l_1, h_1) \approx (l_2, h_2) \iff l_1 = l_2$

That is, the observer can distinguish two states only when they differ on
the low variable part. Clearly, a more powerful attacker is the one who can
distinguish any two states from one another, or

$(l_1, h_1) \sim (l_2, h_2) \iff l_1 = l_2 \wedge h_1 = h_2$

The $\sim$-observer gains more information than the $\approx$-observer by
comparing states, therefore $\approx \sqsubseteq \sim$.

2.3. Lattice of information as a lattice of random variables. A random
variable (noted r.v.) is usually defined as a map $X : D \to \mathbb{R}$,
where D is a finite set with a probability distribution and the real
numbers $\mathbb{R}$ are the range of X. For each element $d \in D$, its
probability will be denoted $\mu(d)$. For every element $x \in \mathbb{R}$
we write $\mu(X = x)$ (or often in short $\mu(x)$) to mean the probability
that X takes on the value x, i.e.

$\mu(x) \stackrel{\mathrm{def}}{=} \sum_{d \in X^{-1}(x)} \mu(d)$

In other words, what we observe by $X = x$ is that the input to X in D
belongs to the set $X^{-1}(x)$. From that perspective, X partitions the
space D into sets which are indistinguishable to an observer who sees the
value that X takes on^2. This can be stated relationally by taking the
kernel of X, which defines the following equivalence relation ker(X):

(2) $d\ \ker(X)\ d'$ iff $X(d) = X(d')$

Equivalently we write $X \simeq Y$ whenever the following holds

$X \simeq Y$ iff $\{X^{-1}(x) : x \in \mathbb{R}\} = \{Y^{-1}(y) : y \in \mathbb{R}\}$

and thus if $X \simeq Y$ then $H(X) = H(Y)$.

This shows that each element of the lattice LoI can be seen as a random
variable.

Given two r.v. X, Y in LoI we define the joint random variable (X, Y) as
their least upper bound in LoI, i.e. $X \sqcup Y$. It is easy to verify that
$X \sqcup Y$ is the partition obtained by all possible intersections of
blocks of X with blocks of Y.

2.4. Basic concepts of Information Theory. This section contains a
very short review of some basic definitions of Information Theory;
additional background is readily available in textbooks (the standard being
the Cover and Thomas textbook [CT]). Given a space of events with
probabilities $P = (p_i)_{i \in N}$ (N is a set of indices), Shannon's
entropy is defined as

(3) $H(X) = -\sum_{i \in N} p_i \log p_i$

It is usually said that this number measures the average information content
of the set of events: if there is an event with probability 1 then the
entropy will be 0, and if the distribution is uniform, i.e. no event is more
likely than any other, the entropy is maximal, i.e. $\log |X|$. In the
literature the terms information content and uncertainty in this context
are often used interchangeably: both terms refer to the number of possible
distinctions on the set of events in the sense we discussed before.

^2 We define an event for the random variable as a block in the partition.

The entropy of a r.v. X is just the entropy of its probability distribution,
i.e.

$H(X) = -\sum_{x \in X} \mu(X = x) \log \mu(X = x)$

Given two random variables X and Y, the joint entropy $H(X, Y)$ measures
the uncertainty of the joint r.v. $(X, Y)$. It is defined as

$H(X, Y) = -\sum_{x \in X, y \in Y} \mu(X = x, Y = y) \log \mu(X = x, Y = y)$

Conditional entropy $H(X|Y)$ measures the uncertainty about X given
knowledge of Y. It is defined as $H(X, Y) - H(Y)$. The higher $H(X|Y)$
is, the lower is the correlation between X and Y. It is easy to see that
if X is a function of Y, then $H(X|Y) = 0$ (there is no uncertainty on X
knowing Y if X is a function of Y) and if X and Y are independent then
$H(X|Y) = H(X)$ (knowledge of Y doesn't change the uncertainty on X if
they are independent).

Mutual information $I(X; Y)$ is a measure of how much information X
and Y share. It can be defined as

$I(X; Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)$

Thus the information shared between X and Y is the information of X (resp.
Y) from which the information about X given Y (resp. Y given X) has been
deducted. This quantity measures the correlation between X and Y. For
example X and Y are independent iff $I(X; Y) = 0$.

Mutual information is a measure of binary interaction. Conditional mutual
information, a form of ternary interaction, will be used to quantify
leakage. Conditional mutual information measures the correlation between
two random variables conditioned on a third random variable; it is defined
as:

$I(X; Y|Z) = H(X|Z) - H(X|Y, Z) = H(Y|Z) - H(Y|X, Z)$
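
The definitions of this section can be checked numerically on small joint distributions. The following Python sketch is illustrative only: it assumes a joint distribution given as a dictionary from pairs (x, y) to probabilities, uses base-2 logarithms, and the helper names H, marginal, cond_entropy and mutual_information are hypothetical names introduced here.

```python
import math
from collections import defaultdict


def H(dist):
    """Shannon entropy (in bits) of a dict mapping outcomes to probabilities."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def marginal(joint, axis):
    """Marginal of a joint dict {(x, y): p} on axis 0 (X) or axis 1 (Y)."""
    out = defaultdict(float)
    for pair, p in joint.items():
        out[pair[axis]] += p
    return dict(out)


def cond_entropy(joint, given_axis):
    """Entropy of the other variable given the one on `given_axis`,
    computed as H(X, Y) - H(given)."""
    return H(joint) - H(marginal(joint, given_axis))


def mutual_information(joint):
    """I(X; Y) = H(X) - H(X|Y)."""
    return H(marginal(joint, 0)) - cond_entropy(joint, 1)


# X is a fair bit and Y = X, so X is a function of Y.
copy = {(0, 0): 0.5, (1, 1): 0.5}
# X and Y are independent fair bits.
indep = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}

print(cond_entropy(copy, 1), mutual_information(copy))
print(cond_entropy(indep, 1), mutual_information(indep))
```

In the first case H(X|Y) is 0 and one full bit is shared; in the second H(X|Y) equals H(X) and the mutual information is 0, matching the two properties stated above.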

2.5. Measures on the lattice of information. Suppose we want to
quantify the amount of information provided by a point in the lattice of
information.

We could for example associate to a partition P the measure |P| = "number
of blocks in P". This measure would be 1 for the least informative
partition; its maximal value would be the number of atoms and would be
reached by the top partition. It is also true that $A \sqsubseteq B$ implies
$|A| \le |B|$, so the measure reflects the order of the lattice. An
important property of "additivity" for measures is the inclusion-exclusion
principle: this principle says that things should not be counted twice. In
terms of sets, the inclusion-exclusion principle says that the number of
elements in a union of two sets is the sum of the number of elements of the
two sets minus the number of elements in the intersection. The
inclusion-exclusion principle is universal, e.g. in propositional logic the
truth value of $A \vee B$ is given by the truth value of A plus the truth
value of B minus the truth value of $A \wedge B$. In the case of the number
of blocks the inclusion-exclusion principle is:

$|A \sqcup B| = |A| + |B| - |A \sqcap B|$

Unfortunately this property does not hold. As an example, taking

$A = \{\{1,2\},\{3,4\}\}, \quad B = \{\{1,3\},\{2,4\}\}$

as two partitions, their join and meet will be

$A \sqcup B = \{\{1\},\{2\},\{3\},\{4\}\}, \quad A \sqcap B = \{\{1,2,3,4\}\}$

hence $|A \sqcup B| = 4 \neq 3 = |A| + |B| - |A \sqcap B|$.

Another problem with the map $|\cdot|$ is that when we consider LoI as a
lattice of random variables the above measure may end up being too crude;
in fact, all probabilities are disregarded^3 by $|\cdot|$. To address these
problems more abstract lattice theoretic notions have been introduced in
the literature [B].

A valuation on LoI is a real valued map $\nu : LoI \to \mathbb{R}$ that
satisfies the following properties:

(4) $\nu(X \sqcup Y) = \nu(X) + \nu(Y) - \nu(X \sqcap Y)$

(5) $X \sqsubseteq Y$ implies $\nu(X) \le \nu(Y)$

A join semivaluation is a weak valuation, i.e. a real valued map satisfying

(6) $\nu(X \sqcup Y) \le \nu(X) + \nu(Y) - \nu(X \sqcap Y)$

(7) $X \sqsubseteq Y$ implies $\nu(X) \le \nu(Y)$

for every element X and Y in a lattice [B]. The property (5) is
order-preserving: a higher element in the lattice has a larger valuation
than elements below itself. The first property (6) is a weakened
inclusion-exclusion principle.

Proposition 1. Entropy is a join semivaluation on LoI, by defining

(8) $\nu(X \sqcup Y) = H(X, Y)$

Proof. Property (7) is well known; for inequality (6) start from the known
equality

$H(X, Y) = H(X) + H(Y) - I(X; Y)$

It will hence be enough to prove that

$H(X \sqcap Y) \le I(X; Y)$

This can be proved by noticing that

(1) $H(X \sqcap Y) = I(X \sqcap Y; X)$: this is clear because
$I(X \sqcap Y; X)$ measures the information shared between $X \sqcap Y$ and
X, and because $X \sqcap Y \sqsubseteq X$ such a measure has to be
$H(X \sqcap Y)$

(2) $I(X \sqcap Y; X) \le I(Y; X)$: this is clear because
$X \sqcap Y \sqsubseteq Y$, hence there is more information shareable
between Y and X than between $X \sqcap Y$ and X

Combining we have

$H(X \sqcap Y) = I(X \sqcap Y; X) \le I(Y; X)$

□

^3 We will see however in later sections how the number of blocks relates to
Information Theory and channel capacity.

2.6. Note: Entropy as the best measure on LoI. An important result
proved by Nakamura [N] gives a particular importance to Shannon entropy
as a measure on LoI. He proved that the only probability-based join
semivaluation on the lattice of information is Shannon's entropy. It is
easy to show that a valuation itself is not definable on this lattice, thus
Shannon's entropy is the best approximation to a probability-based
valuation on this lattice.

Nakamura starts by considering a family of functions
$(f_n)_{n \in \mathbb{N}}$ such that $f_n$ is defined on a set of n
probabilities $p_1, \dots, p_n$ and satisfies:

(1) $f_n$ is continuous

(2) $f_n$ is permutation invariant, i.e.
$f_n(p_1, \dots, p_n) = f_n(p_{\pi(1)}, \dots, p_{\pi(n)})$ for any
permutation $\pi$

(3) $f_{n+1}(p_1, \dots, p_n, 0) = f_n(p_1, \dots, p_n)$

Such a family $(f_n)_{n \in \mathbb{N}}$ induces a function F on partitions
with n blocks $X = \{X_1, \dots, X_n\}$, with block $X_i$ having
probability $p_i$:

$F(X) = f_n(p_1, \dots, p_n)$

Suppose now that

(1) F is a join-semivaluation on all lattices of partitions

(2) If two partitions X, Y are independent (in the probability theory
sense) then

$F(X \sqcup Y) = F(X) + F(Y)$

Nakamura's result is then that such a function F is, up to a constant,
Shannon's entropy function, i.e.

$F(X) = f_n(p_1, \dots, p_n) = -c \sum_{1 \le i \le n} p_i \log(p_i)$

3. Lattice of Information, expected probability of guessing, 
expected number of guesses and entropy 

This section contains the main results of this article, i.e. the
correspondence between the order relation of LoI, expected probability of
guessing, expected number of guesses and entropy.

3.1. Expected probability of guessing. We want to define, given an
equivalence relation, the average probability of guessing the secret in n
tries.

Given a set X where each element has an associated probability (w.l.g. we
assume the probabilities to be ordered decreasingly, i.e.
$\mu(x_i) \ge \mu(x_{i+1})$), define the probability of guessing the secret
in n tries as

$g_{n,\mu}(X) = \sum_{1 \le i \le n} \mu(x_i)$

Given a partition X and a distribution $\mu$ the probability of guessing the
secret in n tries is

$G_{n,\mu}(X) = \sum_{X_i \in X} g_{n,\mu}(X_i)$

As an example consider the partition 

$\{\{x_1, \dots, x_4\}, \{x_5, x_6\}\}$

where the first four atoms have probability ^ each and X5 , x& have proba- 
bility I each. 

Then the average probability of guessing the secret in 2 tries is | + 1 = | ; 
indeed after the observations and two tries the probability of non guessing 
the secret is | corresponding to not having exhausted all possibilities from 
the first block. 
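
The definitions of g and G can also be checked mechanically on a partition of this shape. The sketch below is illustrative only: the function names g and G mirror the notation of this section, and the concrete probabilities (1/8 for each of the first four atoms, 1/4 for each of the last two) are assumed values, chosen here so that the distribution sums to 1.

```python
from fractions import Fraction


def g(n, block, mu):
    """Probability mass of the n most likely elements of one block."""
    ordered = sorted((mu[x] for x in block), reverse=True)
    return sum(ordered[:n])


def G(n, partition, mu):
    """Expected probability of guessing the secret in n tries:
    the sum of g over all blocks of the partition."""
    return sum(g(n, block, mu) for block in partition)


# Assumed distribution (illustrative values only).
mu = {"x1": Fraction(1, 8), "x2": Fraction(1, 8),
      "x3": Fraction(1, 8), "x4": Fraction(1, 8),
      "x5": Fraction(1, 4), "x6": Fraction(1, 4)}
partition = [{"x1", "x2", "x3", "x4"}, {"x5", "x6"}]

print(G(2, partition, mu))  # prints 3/4
print(G(4, partition, mu))  # prints 1
```

With these assumed values two tries exhaust the second block but only half of the first block, which is where the remaining probability of not guessing comes from; four tries exhaust both blocks.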

Notice that the above definition is the same as having a probability
distribution on each block, computing the probability of guessing the
secret in each block and then taking the weighted average:

$G_{n,\mu}(X) = \sum_{X_i \in X} g_{n,\mu}(X_i) = \sum_{X_i \in X} \mu(X_i) \sum_{1 \le j \le n} \frac{\mu(x_j)}{\mu(X_i)}$

When clear from the context we will omit the subscript $\mu$ from G and g.
Theorem 1.

$X \sqsubseteq Y \iff \forall \mu, n.\ G_{n,\mu}(X) \le G_{n,\mu}(Y)$

Proof. Step 1:

$X \sqsubseteq Y \Rightarrow \forall \mu, n.\ G_{n,\mu}(X) \le G_{n,\mu}(Y)$

w.l.g. it will be enough to consider a block $X_i$ in X splitting into two
blocks $Y_i, Y_j$ in Y; we then need to prove that

$g_n(X_i) \le g_n(Y_i) + g_n(Y_j)$

We can write $g_n(X_i) = \sum_{i \le I} \mu(x_i) + \sum_{j \le J} \mu(x_j)$
where the $x_i$ are elements in the block $Y_i$ and the $x_j$ are in $Y_j$.
We can then write $g_n(Y_i)$ as $\sum_{i \le I} \mu(x_i) + c_i$, where
$c_i \ge 0$ is the sum of the elements in $g_n(Y_i)$ which are not in
$g_n(X_i)$, and similarly $g_n(Y_j)$ can be written as
$\sum_{j \le J} \mu(x_j) + c_j$. We have hence

$g_n(Y_i) + g_n(Y_j) = \sum_{i \le I} \mu(x_i) + c_i + \sum_{j \le J} \mu(x_j) + c_j \ge g_n(X_i)$

Step 2:

$(\forall \mu, n.\ G_n(X) \le G_n(Y)) \Rightarrow X \sqsubseteq Y$

Reason by contradiction: suppose $X \not\sqsubseteq Y$; w.l.g. we can then
find a block $Y_i \in Y$ spread over two (or more) blocks in X. We then take
a distribution that is 0 everywhere apart from the elements in $Y_i$ and
apply the previous reasoning; then for this distribution
$G_n(X) \not\le G_n(Y)$ by taking $n = |Y_i| - 1$. □

As an example consider the partitions

$X = \{\{1,2\},\{3,4\}\}, \quad Y = \{\{1,3\},\{2,4\}\}$

X and Y are not order related because neither partition refines the other;
hence following the theorem we can find distributions and numbers of
guesses ordering them in either order: for $G(Y) < G(X)$ take the
distribution giving $\frac{1}{2}$ to 1, 3 and 0 elsewhere; then
$n = |\{1,3\}| - 1 = 1$ and so we have

$G_1(Y) = g_1(\{1,3\}) = \frac{1}{2} < \frac{1}{2} + \frac{1}{2} = g_1(\{1,2\}) + g_1(\{3,4\}) = G_1(X)$

Likewise for $G(X) < G(Y)$ choose the distribution giving $\frac{1}{2}$ to
1, 2 and 0 elsewhere.
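
The two orderings of this example can be reproduced with a few lines of code. The sketch below is only an illustration: the function name G follows the notation of the text, partitions are lists of Python sets, and the variable names mu_13 and mu_12 are hypothetical labels for the two distributions just described.

```python
from fractions import Fraction


def G(n, partition, mu):
    """Sum, over the blocks, of the n largest probabilities in each block."""
    return sum(
        sum(sorted((mu[s] for s in block), reverse=True)[:n])
        for block in partition
    )


X = [{1, 2}, {3, 4}]
Y = [{1, 3}, {2, 4}]

half = Fraction(1, 2)
mu_13 = {1: half, 2: 0, 3: half, 4: 0}  # all mass on the block {1, 3} of Y
mu_12 = {1: half, 2: half, 3: 0, 4: 0}  # all mass on the block {1, 2} of X

print(G(1, Y, mu_13), G(1, X, mu_13))  # prints: 1/2 1
print(G(1, X, mu_12), G(1, Y, mu_12))  # prints: 1/2 1
```

Each distribution concentrates the mass on one block of one partition, which the other partition splits, so a single guess always succeeds for the splitting partition and succeeds only half of the time for the other.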

Corollary 1.

$X \sqsubseteq Y \iff \forall \mu.\ G_{1,\mu}(X) \le G_{1,\mu}(Y)$

Proof. Direction $\Rightarrow$ is the same as in Theorem 1; direction
$\Leftarrow$ is also similar: just notice that the choice of
$n = |Y_i| - 1$ in $G_{n,\mu}$ implies $n \ge 1$ because, as $Y_i$ is split
in several blocks, it must have at least 2 elements, hence we can replace
$|Y_i| - 1$ with 1. □

3.2. Expected number of guesses. The expected probability of guessing
should be related to the expected number of guesses.

Given a set X where each element has an associated probability (w.l.g. we
assume the probabilities to be ordered decreasingly, i.e.
$\mu(x_i) \ge \mu(x_{i+1})$), define the expected number of guesses as

$NG_\mu(X) = \sum_{1 \le i \le n} i\,\mu(x_i)$

Given a partition X and a distribution $\mu$ the expected number of guesses
is (we abuse the notation):

$NG_\mu(X) = \sum_{X_i \in X} NG_\mu(X_i)$

Intuitively the more is known of the secret the fewer guesses are needed,
hence we should expect the NG order to reverse the LoI order; consider for
example the set {a, b, c, d} with probabilities 5,3,5,5 respectively; we have 
then 

NGi{{a,b,c,d}}) = ^ > H = NG{{{a,d}{b,c}}) 
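
A comparison of this kind can be reproduced with the sketch below. It is illustrative only: the function names ng_block and NG are hypothetical, and the probabilities 1/2, 1/4, 1/8, 1/8 assigned to a, b, c, d are assumed values picked for the illustration.

```python
from fractions import Fraction


def ng_block(block, mu):
    """Sum of i * mu(x_i) over one block, with the elements of the block
    ordered by decreasing probability (most likely guess first)."""
    ordered = sorted((mu[x] for x in block), reverse=True)
    return sum(i * p for i, p in enumerate(ordered, start=1))


def NG(partition, mu):
    """Expected number of guesses for a partition: sum over its blocks."""
    return sum(ng_block(block, mu) for block in partition)


# Assumed distribution (illustrative values only).
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4),
      "c": Fraction(1, 8), "d": Fraction(1, 8)}

coarse = [{"a", "b", "c", "d"}]
fine = [{"a", "d"}, {"b", "c"}]

print(NG(coarse, mu))  # prints 15/8
print(NG(fine, mu))    # prints 5/4
```

The finer partition needs fewer guesses on average, in line with the intuition that the NG order reverses the refinement order.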
We can now show that the LoI order is the dual of the NG order:

Theorem 2.

$X \sqsubseteq Y \iff \forall \mu.\ NG_\mu(Y) \le NG_\mu(X)$

Proof. Direction

$X \sqsubseteq Y \Rightarrow \forall \mu.\ NG_\mu(Y) \le NG_\mu(X)$

w.l.g. it will be enough to consider a block $X_i$ in X splitting into two
blocks $Y_i, Y_j$ in Y; consider an element $x \in X_i$; this element will
appear as a term $j\,\mu(x)$ in the sum $NG_\mu(X_i)$. As the elements of
$X_i$ are split in the two sets $Y_i, Y_j$, the same x will appear in
$NG(Y_i)$ or in $NG(Y_j)$: in any case it will appear as a term
$j'\,\mu(x)$ where $j' \le j$, because $X_i$ is split in the two sets
$Y_i, Y_j$ so the relative order of x in $Y_i$ or $Y_j$ cannot exceed the
relative order of x in $X_i$. Hence the statement is true.

Direction

$X \sqsubseteq Y \Leftarrow \forall \mu.\ NG_\mu(Y) \le NG_\mu(X)$

Reason by contradiction: suppose $X \not\sqsubseteq Y$; w.l.g. we can then
find a block $Y_i \in Y$ spread over two (or more) blocks in X. We then take
a distribution that is 0 everywhere apart from the elements in $Y_i$ and
apply the above reasoning: then for this distribution
$NG(Y) \not\le NG(X)$. □

3.3. Entropy and LoI. The next fundamental result is about entropy;
again we can relate entropy to order in LoI. Two partitions are order
related if and only if they are entropy related (in the same direction) for
all possible distributions.

Theorem 3.

$X \sqsubseteq Y \iff \forall \mu.\ H_\mu(X) \le H_\mu(Y)$

Proof. Step 1:

$X \sqsubseteq Y \Rightarrow \forall \mu.\ H_\mu(X) \le H_\mu(Y)$

This is a well known property of entropy: taking larger probabilities
reduces entropy, and it is a consequence of the Jensen inequality.

Step 2:

$X \sqsubseteq Y \Leftarrow \forall \mu.\ H_\mu(X) \le H_\mu(Y)$

Reason by contraposition; suppose $X \not\sqsubseteq Y$; w.l.g. we can then
find a block $Y_i \in Y$ spread over two (or more) blocks in X (say
$X_1, \dots, X_n$). We then take a distribution that is 0 everywhere apart
from the elements in $Y_i$; notice that for such a distribution
$\mu(Y_i) = 1$ whereas in X there are two or more blocks with non zero
probability: we have hence

$H(X) = -\sum_{1 \le i \le n} \mu(X_i) \log(\mu(X_i)) > 0 = -\mu(Y_i) \log(\mu(Y_i)) = H(Y)$

□

3.4. Shannon's order of information. The Lattice of Information was 
pioneered in a little known note by Shannon [S2] in order to characterise 
information. 

One of Shannon's motivations was that while Information Theory is a 
measure of information it is not a characterisation of it. Information Theory 
aims to measure the amount of information of random variables or of some 
sort of stochastic process: what the information is about is not a concern 
of the theory, the measure is based on the number of distinctions available 
in an information context. As an example consider the information-wise 
very different processes "flipping a coin" and "presidential election between
two candidates". While the first is a rather inconsequential process and the
second may have important consequences they are both contexts allowing 
for two choices hence they both have an information measure of (at most) 1 
bit. In a context where n choices are possible (a process with n outcomes) 
the information associated is measured in terms of the number of bits needed 
to encode those possible choices, so it is at most log2(n). Hence completely 
different information contexts may result in the same information theoretical 
measure. 

We can however try to characterise "information" using Information
Theory. In the above example, while "flipping a coin" and "presidential
election between two candidates" may have the same measure, it is not the
case that knowing one of the two gives information about the other, so
$H(X|Y) > 0$ for X, Y being one of "flipping a coin" or "presidential
election".

Given random variables X, Y Shannon's order is defined by:

$X \le_d Y \iff H(X|Y) = 0$

The intuition here is that Y provides complete information about X, or
equivalently X has less information than Y, so X is an abstraction of Y
(some information is forgotten).

Shannon also defined the related distance function:

$d(X, Y) = H(X|Y) + H(Y|X)$

The function d and the relation $\le_d$ are related as follows:

$d(X, Y) = 0 \iff X \le_d Y \wedge Y \le_d X$

In fact suppose $d(X, Y) = 0$; then $H(X|Y) + H(Y|X) = 0$, so as
conditional entropy is non negative $X \le_d Y \wedge Y \le_d X$. On the
other hand $X \le_d Y \wedge Y \le_d X$ implies $H(X|Y) = 0$,
$H(Y|X) = 0$, so $d(X, Y) = 0$.

The equivalence classes of the order $\le_d$, i.e. points s.t.
$X \le_d Y \wedge Y \le_d X$, or equivalently the sets of points of
distance 0, are the information theoretical characterization of
information: all items in a class can be seen as objects having the same
information, not just sharing the same measure.
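
Shannon's order and distance can be computed directly on partitions once a distribution on the atoms is fixed. The sketch below is illustrative only: it assumes partitions given as lists of Python sets, a uniform distribution on four atoms, base-2 logarithms, and hypothetical helper names (H, join, cond_H, distance); the conditional entropy is computed as the entropy of the join minus the entropy of the conditioning partition.

```python
import math


def H(partition, mu):
    """Shannon entropy (bits) of a partition under the atom distribution mu."""
    probs = [sum(mu[a] for a in block) for block in partition]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def join(p, q):
    """Joint random variable: non-empty intersections of blocks."""
    return [a & b for a in p for b in q if a & b]


def cond_H(x, y, mu):
    """H(X|Y) = H(X, Y) - H(Y), where (X, Y) is the join of X and Y."""
    return H(join(x, y), mu) - H(y, mu)


def distance(x, y, mu):
    """Shannon's distance d(X, Y) = H(X|Y) + H(Y|X)."""
    return cond_H(x, y, mu) + cond_H(y, x, mu)


mu = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
X = [{1, 2}, {3, 4}]
Y = [{1}, {2}, {3}, {4}]   # refines X
Z = [{3, 4}, {1, 2}]       # same blocks as X, listed in another order

print(cond_H(X, Y, mu))    # 0.0: Y determines X
print(cond_H(Y, X, mu))    # 1.0: X leaves one bit of Y undetermined
print(distance(X, Z, mu))  # 0.0: same information
```

The first value being zero corresponds to X being below Y in Shannon's order, while the zero distance between X and Z reflects that they lie in the same equivalence class.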

Shannon's order and LoI order are the same:

Theorem 4.

$X \sqsubseteq Y \iff \forall \mu.\ X \le_d Y$

Proof. Direction $X \sqsubseteq Y \Rightarrow \forall \mu.\ X \le_d Y$:
By definition of join in a lattice

$X \sqsubseteq Y \iff X \sqcup Y = Y$

hence we have

$X \sqsubseteq Y \Rightarrow H(X, Y) = H(X \sqcup Y) = H(Y)$

and so

$H(X|Y) = H(X, Y) - H(Y) = H(Y) - H(Y) = 0$

which proves $\forall \mu.\ X \le_d Y$.

For the other direction, assuming $X \not\sqsubseteq Y$ then
$Y \sqsubset X \sqcup Y$, so we can find a distribution s.t.
$H(X \sqcup Y) > H(Y)$ and so

$H(X|Y) = H(X \sqcup Y) - H(Y) > H(Y) - H(Y) = 0$

and we conclude $X \not\le_d Y$. □

Shannon also noticed that d defines a pseudometric and so the quotient
space by the equivalence classes of points of distance 0 is a metric space.

4. Measuring leakage of programs 

Now we want to connect LoI with leakage of confidential information in
programs.

4.1. Observations over programs. Observations over a program P form 
an equivalence relation on states of P. A particular equivalence class will be 
called an observable. Hence an observable is a set of states indistinguishable 
by an attacker making that observation. 

The above intuition can be formalized in terms of several program
semantics. We will concentrate here on a specific class of observations:
the output observations [M2, M1]. For this observation the random variable
associated to a program P is the equivalence relation on any two states
$\sigma, \sigma'$ from the universe of states S defined by

(9) $\sigma \simeq \sigma' \iff [\![P]\!](\sigma) = [\![P]\!](\sigma')$

where $[\![P]\!]$ represents the denotational semantics of P. Hence the
equivalence relation amounts to "have the same observable output". We
denote the interpretation of a program P in LoI as defined by the
equivalence relation (9) by LoI(P). According to denotational semantics
commands are considered as state transformers, informally maps which change
the values of variables in the memory; similarly, language expressions are
interpreted as maps from the memory to values. The equivalence relation
LoI(P) is hence nothing else than the set-theoretical kernel of the
denotational semantics of P. Assuming that the set of confidential inputs h
is equipped with a probability distribution $\mu$ we can see $LoI_\mu(P)$ as
a random variable. We will write simply LoI(P) unless we need to specify a
specific distribution $\mu$.

4.2. LoI interpretation of programs and basic properties. In this paper
we will consider the well known while programming language [W], that
is a simple imperative language with assignments, sequencing, conditionals
and loops. Syntax and semantics for the language are standard, as in
e.g. [W]. The expressions of the language are arithmetic expressions, with
constants 0, 1, . . . and boolean expressions with constants tt, ff.

To see a concrete example, let P be the program

if (h==0) then x=0; else x=1;

where the variable h ranges over {0,1,2,3}. We will assume for the time
being that in all programs we consider the low variables are initialized in
the code; we will discuss this assumption in section 5.

The equivalence relation (i.e. partition) LoI(P) associated to the above
program is then

$LoI(P) = \{\underbrace{\{0\}}_{x=0},\ \underbrace{\{1,2,3\}}_{x=1}\}$

LoI(P) effectively partitions the domain of the variable h, where each
disjoint subset represents an output. The partition reflects the idea of
what an attacker can learn of secret inputs by backwards analysis of the
program, from the outputs to the inputs.

The quantitative evaluation of the partition LoI(P) measures such knowledge
gains of an attacker, solely depending on the partition of states and
the probability distribution of the input.
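
For a program this small the partition can be obtained by brute force, grouping the secret inputs by the output they produce. The sketch below is only an illustration, not the technique used by the verification tools cited earlier: the Python function program is a hand translation of the example program, and loi is a hypothetical helper name.

```python
from collections import defaultdict


def program(h):
    """Hand translation of: if (h==0) then x=0; else x=1; (returns x)."""
    return 0 if h == 0 else 1


def loi(prog, secrets):
    """Kernel of `prog`: secrets with the same output share a block.
    Returns a dict mapping each observable output to its block."""
    blocks = defaultdict(set)
    for h in secrets:
        blocks[prog(h)].add(h)
    return dict(blocks)


partition = loi(program, range(4))
print(partition)  # {0: {0}, 1: {1, 2, 3}}
```

Enumerating all secrets is only feasible for tiny domains; it is used here purely to make the backwards reading from outputs to inputs explicit.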

4.3. Definition of leakage. Let us start from the following intuition 

The leakage of confidential information of a program is de- 
fined as the difference between an attacker's uncertainty about 
the secret before and after available observations about the 
program. 

For a Shannon-based measure, the above intuition can be expressed in terms 
of conditional mutual information. In fact if we start by observing that the 
attacker uncertainty about the secret before observations is H[h\l) and the 
attacker uncertainty about the secret after observations is H{h\l,Lol[P)) 
then using the definition of conditional mutual information we define leakage 
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as 

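$$H(h \mid l) - H(h \mid l, \mathrm{LoI}(P)) = I(h; \mathrm{LoI}(P) \mid l)$$

We can now simplify the above definition as follows:

$$\begin{aligned}
I(\mathrm{LoI}(P); h \mid l) &= H(h \mid l) - H(h \mid l, \mathrm{LoI}(P)) \\
&= H(\mathrm{LoI}(P) \mid l) - H(\mathrm{LoI}(P) \mid l, h) \\
&=_A H(\mathrm{LoI}(P) \mid l) - 0 \\
&= H(\mathrm{LoI}(P) \mid l) \\
&=_B H(\mathrm{LoI}(P)) \qquad (10)
\end{aligned}$$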

where in the first equality we used the symmetry of conditional mutual 
information; the equality A holds because the program is deterministic and 
B holds when the program only depends on the high inputs, for example 
when all low variables are initialised in the code of the program; we will 
discuss this assumption in the next section. Thus, for such programs 

Leakage: (Shannon-based) leakage of a program P is defined
as the (Shannon) entropy of the partition LoI(P).
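The definition can be illustrated with a short sketch that computes the entropy of a partition under a given distribution on the secret. The helper names are assumptions made for this illustration; the partition is the one of the example program of section 4.2 and the distribution is taken to be uniform.

```python
from math import log2

def block_probabilities(partition, mu):
    """Probability of each block, i.e. of each possible observation."""
    return [sum(mu[h] for h in block) for block in partition]

def shannon_leakage(partition, mu):
    """Entropy (in bits) of the partition under the distribution mu."""
    return -sum(p * log2(p) for p in block_probabilities(partition, mu) if p > 0)

partition = [{0}, {1, 2, 3}]
uniform = {h: 0.25 for h in range(4)}
leakage = shannon_leakage(partition, uniform)
print(round(leakage, 4))
```

Under these assumptions the leakage is roughly 0.81 bits, well below the 2 bits of the secret.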

We can now apply the results from section 3 in the context of programs,
hence we deduce a correspondence between the refinement order of the
observations, leakage, expected probability of guessing and expected number
of guesses.

In terms of programs the results from section 3 state the following
equivalences:

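(1) $\mathrm{LoI}(P) \sqsubseteq \mathrm{LoI}(P')$

(2) $\forall \mu.\ H_\mu(\mathrm{LoI}(P)) \le H_\mu(\mathrm{LoI}(P'))$

(3) $\forall n, \mu.\ G_{n,\mu}(\mathrm{LoI}(P)) \le G_{n,\mu}(\mathrm{LoI}(P'))$

(4) $\forall \mu.\ NG_\mu(\mathrm{LoI}(P')) \le NG_\mu(\mathrm{LoI}(P))$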

In words: the equivalence relation associated to a program P is refined by
the equivalence relation associated to a program P' if and only if for all
distributions the leakage of P is at most the leakage of P', if and only
if for any number of tries and any distribution the expected probability of
guessing the secret according to P is at most the one according to P', if and
only if for all distributions the expected number of guesses required to guess
the secret according to P is at least the expected number of guesses
required to guess the secret according to P'.
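The direction from (1) to (2) can be checked empirically on a small example. The sketch below is only a sanity check on randomly sampled distributions, not a proof; the two partitions and all helper names are chosen for illustration.

```python
import random
from math import log2

def refines(coarse, fine):
    """True when every block of `fine` lies inside a block of `coarse`,
    i.e. `coarse` is below `fine` in the refinement order."""
    return all(any(f <= c for c in coarse) for f in fine)

def entropy(partition, mu):
    probs = [sum(mu[h] for h in block) for block in partition]
    return -sum(p * log2(p) for p in probs if p > 0)

def random_distribution(values, rng):
    weights = [rng.random() for _ in values]
    total = sum(weights)
    return {v: w / total for v, w in zip(values, weights)}

coarse = [frozenset({0}), frozenset({1, 2, 3})]
fine = [frozenset({0}), frozenset({1}), frozenset({2, 3})]

rng = random.Random(0)  # fixed seed so the check is reproducible
ordered = refines(coarse, fine)
entropy_respects_order = all(
    entropy(coarse, mu) <= entropy(fine, mu) + 1e-12
    for mu in (random_distribution(range(4), rng) for _ in range(1000))
)
print(ordered, entropy_respects_order)
```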

4.4. Relation with Yasuoka and Terauchi ordering results. These order
results are related to some recent work by Yasuoka and Terauchi [YT1]:
they define quantitative analysis in terms of Shannon entropy, Smith's
vulnerability and guessability.

Their definitions follow the pattern we discussed before:

The quantitative analysis of confidential information of a
program is defined as the difference between an attacker's
capability before and after available observations about the
program.

By replacing the word "capability" with (A) uncertainty about the secret,
(B) probability of guessing the secret in one try, or (C) expected number of
guesses, we derive different quantitative analyses. Once (A), (B), (C) are
formalized as a function F (together with its conditional counterpart F(-|-))
on a probability space, all these definitions have the form:

$$F(h \mid l) - F(h \mid l, \mathrm{LoI}(P))$$

Formally the choices for F, F(-|-) are:

(A) for uncertainty about the secret, F and F(-|-) are Shannon entropy
and conditional entropy;

(B) for probability of guessing in one try (noted ME)

$$F(X) = -\log\bigl(\max_{x} \mu(X = x)\bigr) \quad\text{and}\quad F(X \mid Y) = -\log\Bigl(\sum_{y \in Y} \mu(y)\, \max_{x} \mu(X = x \mid Y = y)\Bigr)$$

(C) for the expected number of guesses (noted GE)

$$F(X) = \sum_{x_i \in X,\, i \ge 1} i\, \mu(X = x_i) \quad\text{and}\quad F(X \mid Y) = \sum_{y \in Y} \mu(y) \sum_{x_i \in X,\, i \ge 1} i\, \mu(X = x_i \mid Y = y)$$

(assuming $i < j$ implies $\mu(X = x_i) \ge \mu(X = x_j)$).

Shannon's entropy is unique in that

(1) conditional mutual information is symmetric, so for F being Shannon's
entropy

$$F(h \mid l) - F(h \mid l, \mathrm{LoI}(P)) = F(\mathrm{LoI}(P) \mid l) - F(\mathrm{LoI}(P) \mid l, h)$$

and

(2) the entropy of the result of a function given its arguments is 0, so

$$F(\mathrm{LoI}(P) \mid l, h) = 0$$

In particular, again considering low inputs initialized in the program,
it is only when F is Shannon's entropy that

$$F(h) - F(h \mid \mathrm{LoI}(P)) = F(\mathrm{LoI}(P))$$

What this means is that

It is only when using Shannon entropy that leakage, as the
difference in capability before and after observations, is a
measure on LoI.

We now want to relate the results from section 3 with the ME and GE
definitions of leakage.

To appreciate the difference in the definitions let's consider the examples
from [YT1]; we consider the following programs:

(1) M1 ≡ if (h == l) o = 0; else o = 1;

(2) M2 ≡ o = h;

Table 1 shows the results of the analyses of these programs for a 2-bit
secret uniformly distributed. Columns H, G, NG correspond to our
definitions of Shannon entropy, the expected probability of guessing (in 1
guess) and the expected number of guesses on LoI(P), i.e. H, G, NG
stand for H(LoI(P)), G(LoI(P)), NG(LoI(P)). ME and GE correspond
to the definitions in [YT1] for computing the min-entropy and the guessing
entropy on P; the final two columns ME' and GE' correspond to applying
the definitions in [YT1] directly to LoI(P). For example ME(M1) =
ME(h) - ME(h|LoI(M1)) and ME'(M1) = ME(LoI(M1)).



Table 1. Comparing measures

        H        G      NG     ME    GE     ME'     GE'
M1      0.8112   0.5    1.75   1     0.75   0.415   1.25
M2      2        1      1      2     1.5    2       2.5


The results express different ideas which can be connected in a uniform
narrative. Take program M1: G = 0.5 means that after running the program an
attacker has probability 0.5 of guessing the secret in one try. The chances of
guessing the secret have doubled from 0.25 (before the program) to 0.5 (after
the program), so the rate of increase is $2^{ME(M_1)} = 2$; the average number
of guesses needed (initially 2.5) has been reduced by 0.75 (GE=0.75), so that
it now takes on average NG=1.75 tries to guess it. And the observations
provide 0.8112 bits of information about the secret.

Consider now the second row, i.e. program M2: here H = 2 means
that everything is leaked, i.e. the observations provide 2 bits of information
about the secret. In this case we are sure to guess the secret in one try
(G=1, NG=1) and our chances have hence increased fourfold from the initial
probability ($2^{ME(M_2)} = 2^2$, so $0.25 \cdot 2^{ME(M_2)} = 1$); the average
number of guesses needed (initially 2.5) has been reduced by 1.5 (GE=1.5) to
one (NG=1).

We have left the measures ME', GE' out of the narrative. The reason
is that they seem of limited interest; for example ME' will always pick
the most likely observation and disregard all the others: a dubious security
measure.
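The entries of Table 1 can be recomputed with a few lines of code. The sketch below assumes that the guard of M1 compares h against a fixed value (here the constant 1), that the secret is uniform over {0,1,2,3}, and that ME is evaluated as the logarithm of the ratio between the guessing probability after and before the observation; all function names are illustrative.

```python
from math import log2

def kernel(program, secrets):
    """Partition of the secrets induced by the program's output."""
    blocks = {}
    for h in secrets:
        blocks.setdefault(program(h), set()).add(h)
    return list(blocks.values())

def H(partition, mu):
    """Shannon entropy of the partition."""
    probs = [sum(mu[h] for h in b) for b in partition]
    return -sum(p * log2(p) for p in probs if p > 0)

def G(partition, mu):
    """Expected probability of guessing the secret in one try."""
    return sum(max(mu[h] for h in b) for b in partition)

def NG(partition, mu):
    """Expected number of guesses, trying likelier secrets first in each block."""
    total = 0.0
    for b in partition:
        ranked = sorted((mu[h] for h in b), reverse=True)
        total += sum(i * p for i, p in enumerate(ranked, start=1))
    return total

def measures(program, secrets, mu):
    part = kernel(program, secrets)
    whole = [set(secrets)]  # a single block: nothing has been observed yet
    block_probs = sorted((sum(mu[h] for h in b) for b in part), reverse=True)
    return {
        "H": H(part, mu),
        "G": G(part, mu),
        "NG": NG(part, mu),
        "ME": log2(G(part, mu) / G(whole, mu)),
        "GE": NG(whole, mu) - NG(part, mu),
        "ME'": -log2(block_probs[0]),
        "GE'": sum(i * p for i, p in enumerate(block_probs, start=1)),
    }

secrets = range(4)
uniform = {h: 0.25 for h in secrets}
m1 = measures(lambda h: 0 if h == 1 else 1, secrets, uniform)  # assumed guard: h == 1
m2 = measures(lambda h: h, secrets, uniform)

for name, row in (("M1", m1), ("M2", m2)):
    print(name, {k: round(v, 4) for k, v in row.items()})
```

Any other fixed value in the guard of M1 gives the same numbers, since only the shape of the partition (one singleton and one block of three) matters under the uniform distribution.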

The narrative can be strengthened formally:

Proposition 2. For a program P

(1) $\forall \mu.\ 2^{ME_\mu(P)}\, G(h) = G(\mathrm{LoI}(P))$

(2) $\forall \mu.\ GE(P) = NG(h) - NG(\mathrm{LoI}(P))$

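Proof. (1) We start by recalling Smith's definition of vulnerability:

$$ME(P) = \log \frac{1}{\max_h \mu(h)} - \log \frac{1}{\sum_{o \in \mathrm{LoI}(P)} \mu(o) \max_h \mu(h \mid o)}$$

We have then

$$\begin{aligned}
2^{ME(P)}\, G(h) &= 2^{\log \sum_{o \in \mathrm{LoI}(P)} \mu(o) \max_h \mu(h \mid o) \,-\, \log \max_h \mu(h)}\; G(h) \\
&= \frac{2^{\log \sum_{o \in \mathrm{LoI}(P)} \mu(o) \max_h \mu(h \mid o)}}{2^{\log \max_h \mu(h)}}\; G(h) \\
&= \frac{\sum_{o \in \mathrm{LoI}(P)} \mu(o) \max_h \mu(h \mid o)}{\max_h \mu(h)}\; G(h) \\
&= \frac{\sum_{o \in \mathrm{LoI}(P)} \mu(o) \max_h \mu(h \mid o)}{\max_h \mu(h)}\; \max_h \mu(h) \\
&= \sum_{o \in \mathrm{LoI}(P)} \mu(o) \max_h \mu(h \mid o) \\
&= G(\mathrm{LoI}(P))
\end{aligned}$$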

(2) We can rewrite the definition of GE(P) from [YT1] as:

$$GE(P) = \sum_{1 \le i \le n} i\, \mu(h_i) \;-\; \sum_{o \in \mathrm{LoI}(P)}\ \sum_{h_i \in o,\ 1 \le i \le m} i\, \mu(h_i)$$

It is easy to see that the first term coincides with our definition of NG on
sets and the second term with our definition of NG on partitions; the result
then follows. □

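The connections between these concepts extend to the orders they induce:

Theorem 5. Given programs P, P' (not depending on the low inputs) the
following are equivalent:

(1) $\mathrm{LoI}(P) \sqsubseteq \mathrm{LoI}(P')$

(2) $\forall \mu.\ \mathrm{LoI}(P) \le_d \mathrm{LoI}(P')$

(3) $\forall \mu.\ H_\mu(\mathrm{LoI}(P)) \le H_\mu(\mathrm{LoI}(P'))$

(4) $\forall n, \mu.\ G_{n,\mu}(\mathrm{LoI}(P)) \le G_{n,\mu}(\mathrm{LoI}(P'))$

(5) $\forall \mu.\ NG_\mu(\mathrm{LoI}(P')) \le NG_\mu(\mathrm{LoI}(P))$

(6) $\forall \mu.\ ME_\mu(P) \le ME_\mu(P')$

(7) $\forall \mu.\ GE_\mu(P) \le GE_\mu(P')$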

Proof. Equivalence $1 \Leftrightarrow 3$ was first proved in [HM3]. Equivalences
$1 \Leftrightarrow 2, 4, 5$ are proved in section 3; equivalences
$1 \Leftrightarrow 3, 6, 7$ are proven in [YT1]. It may, however, be
interesting to reprove the equivalences in [YT1] using the algebraic
techniques and results from this paper. For example we can prove
$1 \Leftrightarrow 6$ as follows:

$$\begin{aligned}
\mathrm{LoI}(P) \sqsubseteq \mathrm{LoI}(P') &\iff \forall \mu.\ G(\mathrm{LoI}(P)) \le G(\mathrm{LoI}(P')) \\
&\iff \forall \mu.\ 2^{ME(P)}\, G(h) \le 2^{ME(P')}\, G(h) \\
&\iff \forall \mu.\ ME(P) \le ME(P')
\end{aligned}$$

where the first equivalence is corollary 1 and the second is Proposition 2.
$1 \Leftrightarrow 7$ follows from Proposition 2(2): just rewrite it as
GE(P) = NG(h) - NG(LoI(P)). □

Hence we conclude that in terms of the induced orderings all these
quantitative analyses are consistent. In other words it is only on programs not
ordered in LoI that these notions can really differ. One such difference is
now discussed.

4.5. Discussion on Smith's argument on the foundations of Quantitative
Information Flow. Consider the following two programs [Sm]:

(1) P1 ≡ if (h%8 == 0) o = h; else o = 1;

(2) P2 ≡ o = h & 037;

The program P1 will return the value of h when the last three bits of the
secret are 0s and will return 1 otherwise; its LoI interpretation will hence
be the partition of the form

$$X = \{\{h_1 000\}, \ldots, \{h_n 000\}, X_1\}$$

where the h_i are arbitrary binary strings of length k - 3.

The program P2 copies the last 5 bits of the secret into o (here 037 is the
octal constant and & the bitwise and). The associated partition hence has
the shape

$$Y = \{Y_1, \ldots, Y_r\}$$

where each Y_i is a set of strings with the same last 5 bits.

Smith's argument is that under the uniform distribution and for a secret of
size 8k bits the two programs have very similar entropy but
very different guessing behaviour; in the case of the first program in fact
with probability one eighth the whole secret is revealed, while in the second
program all attempts reveal the last 5 bits of the secret but give no indication
of what the remaining bits are. Hence in general it is much easier to guess
the secret in one try after running the first program than it is to guess the
secret after running the second one.

The argument however relies on choosing a particular distribution; this 
choice is independent from the source code and should, we believe, be clearly 
separated from the leakage inherent to the code. 

In fact, since the partitions X, Y are unrelated in LoI, by the results from
section 3 we can find distributions and numbers of guesses that make the
expected guessing probability of either one less than that of the other.

For $G_n(X) < G_n(Y)$ notice that X_1 splits into many blocks Y_i; hence take
any distribution that is non-zero only on the atoms in X_1, e.g. let's consider
the uniform distribution on the atoms in X_1 and take n = |X_1| - 1.

Then

$$G_n(X) = g_n(X_1) = \frac{n}{n+1} < 1 = G_n(Y)$$

To make $G_n(X) > G_n(Y)$ pick any block Y_i in Y whose last three bits are
0s; then this block is split into many X_j in X. Again, by taking the
distribution uniform over the elements of Y_i and zero otherwise, and taking
n = |Y_i| - 1, we have

$$G_n(Y) = g_n(Y_i) = \frac{n}{n+1} < 1 = G_n(X)$$

In fact all distributions giving all probability to values divisible by 8 will
favour program P2 even when we consider a single guess (n=1), and things
don't change when we take ME instead of G_n.

Similarly we can find distributions that make the expected number of
guesses of either of the two programs less than the expected number of guesses
of the other program. In particular, while for the uniform distribution it is
much easier to guess the secret in the case of the first program compared
to the second (which is at the heart of Smith's argument), by choosing the
distribution that is zero everywhere apart from the block X_1 it becomes easier
to guess the secret using the second program. While such a distribution may
be seen as pathological, it still shows the possible problems in making code
analysis dependent on particular distributions.
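The dependence on the distribution can be made concrete with a small experiment. The sketch below assumes 8-bit secrets to keep the example small (the argument itself is stated for longer secrets) and uses illustrative helper names; `G_n` is the expected probability of guessing the secret within n tries.

```python
def kernel(program, secrets):
    """Partition of the secrets induced by the program's output."""
    blocks = {}
    for h in secrets:
        blocks.setdefault(program(h), set()).add(h)
    return list(blocks.values())

def G_n(partition, mu, n):
    """Expected probability of guessing the secret within n tries."""
    total = 0.0
    for block in partition:
        ranked = sorted((mu.get(h, 0.0) for h in block), reverse=True)
        total += sum(ranked[:n])  # best n guesses inside the observed block
    return total

def uniform_on(support):
    support = list(support)
    return {h: 1.0 / len(support) for h in support}

SECRETS = range(256)  # assumption: 8-bit secrets, for illustration only
X = kernel(lambda h: h if h % 8 == 0 else 1, SECRETS)  # program P1
Y = kernel(lambda h: h & 0o37, SECRETS)                # program P2

# Distribution concentrated on X_1, the secrets not divisible by 8.
x1 = [h for h in SECRETS if h % 8 != 0]
mu_a, n_a = uniform_on(x1), len(x1) - 1

# Distribution concentrated on one block of Y whose last three bits are 0.
y_i = [h for h in SECRETS if h & 0o37 == 0o10]
mu_b, n_b = uniform_on(y_i), len(y_i) - 1

print(G_n(X, mu_a, n_a), G_n(Y, mu_a, n_a))  # P1 looks safer here
print(G_n(X, mu_b, n_b), G_n(Y, mu_b, n_b))  # P2 looks safer here
```

The same pair of programs is ranked in opposite ways by the two distributions, which is the point of the argument above.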

4.6. LoI, maximum leakage and Channel Capacity. The relation between
LoI and channel capacity has been investigated in the literature
[MC], [YT2], [KS]. The channel capacity of a program is defined as the
maximum possible leakage for that program. Intuitively this is the context most
advantageous for the attacker. LoI provides an elementary characterization
of channel capacity: in fact, as the leakage is defined by H(LoI(P)), using
the well-known information theoretical fact that the maximal entropy over
a system with n probabilities is log(n), we deduce that the channel capacity
is log(|LoI(P)|).

We write CC(P) for the channel capacity of the program P. We have
then

Proposition 3.

$$\mathrm{LoI}(P) \sqsubseteq \mathrm{LoI}(P') \implies CC(P) \le CC(P')$$

If $\mathrm{LoI}(P) \sqsubseteq \mathrm{LoI}(P')$ then all blocks of LoI(P) are
refined by blocks of LoI(P'), so the number of blocks of LoI(P) is at most the
number of blocks of LoI(P'); but the channel capacity of a program is the log
of the number of blocks of its LoI interpretation, hence the result is proved.

The opposite direction of the implication doesn't hold: for example the 
partitions 

{{a, b, c}, {d}} and {{a, b}, {c, d}} 
are not order related but have the same channel capacity 1. 
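The counterexample can be checked mechanically. In the sketch below the channel capacity is taken to be the base-2 logarithm of the number of blocks, as described above; the helper names are illustrative.

```python
from math import log2

def channel_capacity(partition):
    """log2 of the number of blocks, i.e. of the number of distinct observations."""
    return log2(len(partition))

def refines(coarse, fine):
    """True when every block of `fine` lies inside a block of `coarse`."""
    return all(any(f <= c for c in coarse) for f in fine)

p = [frozenset("abc"), frozenset("d")]
q = [frozenset("ab"), frozenset("cd")]

print(channel_capacity(p), channel_capacity(q))
print(refines(p, q), refines(q, p))
```

Both partitions have two blocks, hence capacity 1, while neither refines the other.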

4.6.1. LoI and min-entropy Channel Capacity. The relation between the
channel capacity of a program P and log(|LoI(P)|) is not confined to Shannon
entropy. In fact Köpf and Smith have shown that even if we choose Smith's
min-entropy quantitative analysis [KS] we get the same value, i.e. the
maximum vulnerability of a program P according to Smith's measure ME is
log(|LoI(P)|). We hence have the equalities

$$CC(P) = \log(|\mathrm{LoI}(P)|) = \max_\mu H_\mu(\mathrm{LoI}(P)) = \max_\mu ME_\mu(P)$$



5. Low inputs, multiple runs and l.u.b. in LoI

A major source of confusion in security analysis derives from poorly de- 
fined attacker models. In this section we discuss a few common modelling 
issues and how they can be dealt with in LoI. 

5.1. Active and passive attackers. The lattice of information allows for
different attacker models: the most common and possibly most interesting is
the one corresponding to an active attacker, i.e. an attacker who controls the
low inputs; a typical example would be a cash machine where an attacker is
able to choose a pin number. An active attacker can be modelled as we did
in the previous sections by assuming that the low variables are initialised in
the code, the initialisation values corresponding to the attacker's choice.

We could however also model a passive attacker, an eavesdropper with no
power to choose the low inputs. In this case the lattice atoms are the pairs
of low and high inputs. Take for example the program

if (h == l) o = 1; else o = 2;

where h, l are 2-bit variables. The partition associated to the program
is:

{{(0, 0)}, {(0, 1), (0, 2), (0, 3)}, . . . , {(3, 3)}, {(3, 0), (3, 1), (3, 2)}}

Assuming a uniform distribution on the low and high inputs we then compute
leakage as

$$H(\mathrm{LoI}(P) \mid l) = 4 \cdot \tfrac{1}{16} \log(4) + 4 \cdot \tfrac{3}{16} \log\bigl(\tfrac{4}{3}\bigr) \approx 0.8113$$

In fact an active attacker is a particular case of this setting, where the
distribution on the inputs is such that only one low input has non-zero
probability. In that case the atoms of the lattice are, up to isomorphism, only
the high inputs and H(LoI(P)|l) = H(LoI(P)).
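The passive-attacker computation can be sketched as follows, assuming that the guard compares the secret h with the low input l and that l and h are independent and uniformly distributed over two bits. The function names are illustrative.

```python
from itertools import product
from math import log2

def password_check(l, h):
    """if (h == l) o = 1; else o = 2;"""
    return 1 if h == l else 2

def conditional_leakage(program, lows, highs):
    """H(LoI(P) | l) for independent, uniformly distributed l and h."""
    lows, highs = list(lows), list(highs)
    atom = 1.0 / (len(lows) * len(highs))  # probability of each (l, h) pair
    p_low = 1.0 / len(lows)
    # Blocks of the partition: pairs sharing the low input and the output.
    blocks = {}
    for l, h in product(lows, highs):
        key = (l, program(l, h))
        blocks[key] = blocks.get(key, 0.0) + atom
    # Sum over blocks of p(block) * log(p(l) / p(block)).
    return sum(p * log2(p_low / p) for p in blocks.values())

leakage = conditional_leakage(password_check, range(4), range(4))
print(leakage)
```

Restricting `lows` to a single value models the active attacker, in which case the result coincides with the entropy of the partition of the high inputs.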

5.2. Non-termination. In this work we have mostly considered output
observations as values. We can however relax this and include non-termination
among the possible observations. This doesn't change the theory:
non-termination is just an additional equivalence class, the class of all input
states over which the program doesn't terminate; of course the usual
computational and complexity problems arise when we try to compute such a
class.

5.3. Multiple runs. Another aspect of an attacker model that has a natural
algebraic interpretation in LoI is an attacker's capability to run the
system n times: for example an attacker trying three pin numbers on a cash
machine. Running a program several times with different low inputs may
reveal more and more information about the secret. For example consider
the password checking program P

if (h == l) o = 1; else o = 2;

If we run it once, assigning the value 5 to the low variable, we gain the
information whether the secret is 5 or not; by running it twice, assigning to
the low variable the value 5 and then the value 7, we will gain the information
whether the secret is 5 or is 7 or something else.

Written in terms of partitions this is nothing else than the join operation
in LoI:

$$\{\{5\}, \{\neq 5\}\} \sqcup \{\{7\}, \{\neq 7\}\} = \{\{5\}, \{7\}, \{\neq 5, 7\}\}$$

Hence the knowledge available to an attacker who can choose the low
inputs and run the program m times is modelled by the partition

$$\mathrm{LoI}(P_1) \sqcup \cdots \sqcup \mathrm{LoI}(P_m)$$

where LoI(P_i) is the partition corresponding to the i-th run of the program.
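A minimal sketch of the join, computed as the set of non-empty pairwise intersections of blocks, is given below. The secret domain 0..9 is an assumption made only to keep the example small, and the function names are illustrative.

```python
def join(p, q):
    """Least upper bound in LoI: all non-empty pairwise intersections of blocks."""
    return {a & b for a in p for b in q if a & b}

def run(low, secrets):
    """Partition induced by one run of the password check with low input `low`."""
    secrets = frozenset(secrets)
    return {frozenset({low}), secrets - {low}}

SECRETS = range(10)  # assumption: a small secret domain, for illustration only
two_runs = join(run(5, SECRETS), run(7, SECRETS))
print(two_runs)
```

Joining further runs with other low inputs keeps splitting the remaining block, which matches the intuition that each failed password attempt rules out one more candidate.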

5.3.1. Does it leak the same information? A related question is whether a
program always leaks the same information on each run;
for example a program leaking the last bit of the secret always leaks the
same information no matter how many times we run it, but a
password check leaks different information when we run it choosing different
low inputs. This question can also be addressed by using l.u.b.: if the
program P leaks different information over different runs this means we can
find two runs P_i, P_j such that

$$\mathrm{LoI}(P_i) \sqcup \mathrm{LoI}(P_j) > \mathrm{LoI}(P_i), \mathrm{LoI}(P_j)$$

The interpretation of multiple runs in terms of l.u.b.s also has, in a sense,
a reverse implication, i.e. it is possible, given programs P1, P2, to build
a program whose interpretation is their l.u.b. This result has a practical
significance: when P1, P2 are different runs of the same program the l.u.b.
is their self-composition [GAR]. Formally [MH]:

Proposition 4. Given programs P1, P2 there exists a program P1⊔2 such
that

$$\mathrm{LoI}(P_{1 \sqcup 2}) = \mathrm{LoI}(P_1) \sqcup \mathrm{LoI}(P_2)$$

Given programs P1, P2, we define P1⊔2 = P1'; P2' where the primed
programs P1', P2' are P1, P2 with variables renamed so as to have disjoint
variable sets. If the two programs are syntactically equivalent, then this
results in self-composition [GAR]. For example, consider the two programs

P1 ≡ if (h == 0) x = 0; else x = 1;    P2 ≡ if (h == 1) x = 0; else x = 1;

with their partitions LoI(P1) = {{0}, {h ≠ 0}} and LoI(P2) = {{1}, {h ≠ 1}}.
The program P1⊔2 is the concatenation of the previous programs with
variable renaming

P1⊔2 ≡ h' = h; if (h' == 0) x' = 0; else x' = 1;
       h'' = h; if (h'' == 1) x'' = 0; else x'' = 1;

The corresponding lattice element is the join, i.e. intersection of blocks, of
the individual programs P1, P2

$$\mathrm{LoI}(P_{1 \sqcup 2}) = \{\{0\}, \{1\}, \{h \neq 0, 1\}\} = \{\{0\}, \{h \neq 0\}\} \sqcup \{\{1\}, \{h \neq 1\}\}$$
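The construction can be checked on the example with a short sketch: the composed program is modelled as a function that observes the pair of renamed outputs, and its kernel is compared with the join of the two individual partitions. The secret domain {0,1,2,3} and all names are assumptions made for illustration.

```python
def kernel(program, secrets):
    """Partition of the secrets induced by the program's output."""
    blocks = {}
    for h in secrets:
        blocks.setdefault(program(h), set()).add(h)
    return {frozenset(b) for b in blocks.values()}

def join(p, q):
    """Least upper bound in LoI: non-empty pairwise intersections of blocks."""
    return {a & b for a in p for b in q if a & b}

def p1(h):
    return 0 if h == 0 else 1

def p2(h):
    return 0 if h == 1 else 1

def p1_join_p2(h):
    """Sequential composition with renamed variables: both outputs are observed."""
    h1 = h
    x1 = p1(h1)
    h2 = h
    x2 = p2(h2)
    return (x1, x2)

SECRETS = range(4)  # assumption: a 2-bit secret
composed = kernel(p1_join_p2, SECRETS)
joined = join(kernel(p1, SECRETS), kernel(p2, SECRETS))
print(composed == joined, composed)
```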

6. Further applications of LoI 
We quickly review two applications of LoI beyond the foundational aspect: 

6.1. Loop analysis. Loop constructs are difficult to analyse. However they 
have a natural interpretation in the lattice of information. In informal terms 
the idea is that loops can be seen as l.u.b. of a chain in the lattice of 
information, where the chain is the interpretation of the different iterations 
of the loop. 

To understand the ideas let's consider the program

l=0;
while (l < h) {
  if (h==2) l=3 else l++
}

and let us now study the partitions it generates. The loop terminating
in 0 iterations will reveal that h=0, i.e. the partition W_0 = {{0}, {1, 2, 3}};
termination in 1 iteration will reveal h=1 if the output is 1 and h=2 if the
output is 3, i.e. W_1 = {{1}, {2}, {0, 3}}; the loop will never terminate in 2
iterations, i.e. W_2 = {{0, 1, 2, 3}}; and termination in 3 iterations will
reveal that h=3 given the output 3, i.e. W_3 = {{3}, {0, 1, 2}}. Let's define
$W_{\le n}$ as $\bigsqcup_{n \ge i \ge 0} W_i$; we have then the chain (which
is trivial in this example)

$$W_{\le 1} = W_{\le 2} = W_{\le 3} = \{\{0\}, \{1\}, \{2\}, \{3\}\}$$

We also introduce an additional partition C to cater for the collisions in the
loop: the collision partition is C = {{0}, {1}, {2, 3}} because for h=2 the loop
terminates with output 3 in 1 iteration and for h=3 the loop terminates
with output 3 in 3 iterations. We have then

$$\mathrm{LoI}(P) = \bigsqcup_{n \ge 0} W_{\le n} \sqcap C = \{\{0\}, \{1\}, \{2, 3\}\}$$

This setting is formalized in [MH]. Given a looping program P define
$W_{\le n}$ as the equivalence relation corresponding to the output
observations available for the loop terminating in $\le n$ iterations and let
the collision equivalence of a loop be the reflexive and transitive closure of
the relation $\sigma \sim \sigma'$ iff $\sigma, \sigma'$ generate the same
output from different iterations.

The following is then true:

Proposition 5.

$$\mathrm{LoI}(P) = \bigsqcup_{n \ge 0} W_{\le n} \sqcap C$$

Hence the leakage H(LoI(P)) for looping programs can be computed in terms
of the chain $(W_{\le n})_{n \ge 0}$ and the collision equivalence C. The
equivalence of this technique with previous information theoretical analysis
of loops [M2] is proved in [MH].
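The example can be replayed with a small simulation. The sketch below runs the loop for every h in {0,1,2,3}, builds the partitions W_i, their joins, and the collision equivalence, and compares the result with the partition obtained directly from the outputs. It is an illustration on this one program, with helper names chosen here, and not the general construction of [MH].

```python
def run_loop(h):
    """Execute the example loop; return (number of iterations, final value of l)."""
    l, iterations = 0, 0
    while l < h:
        l = 3 if h == 2 else l + 1
        iterations += 1
    return iterations, l

def blocks_by(key, states):
    groups = {}
    for s in states:
        groups.setdefault(key(s), set()).add(s)
    return {frozenset(g) for g in groups.values()}

def join(p, q):
    """Least upper bound: non-empty pairwise intersections of blocks."""
    return {a & b for a in p for b in q if a & b}

def merge_overlapping(sets):
    """Merge sets that share an element until all are pairwise disjoint."""
    merged = []
    for s in sets:
        block = set(s)
        for m in [m for m in merged if m & block]:
            block |= m
            merged.remove(m)
        merged.append(block)
    return {frozenset(m) for m in merged}

def meet(p, q):
    """Greatest lower bound: transitive closure of the union of the relations."""
    return merge_overlapping(list(p) + list(q))

STATES = range(4)
trace = {h: run_loop(h) for h in STATES}

def W(i):
    """Observations available from runs ending after exactly i iterations."""
    return blocks_by(lambda h: trace[h][1] if trace[h][0] == i else None, STATES)

def W_up_to(n):
    result = {frozenset(STATES)}
    for i in range(n + 1):
        result = join(result, W(i))
    return result

# Collisions: same output reached after a different number of iterations.
collisions = [frozenset({a, b}) for a in STATES for b in STATES
              if trace[a][1] == trace[b][1] and trace[a][0] != trace[b][0]]
C = merge_overlapping([frozenset({h}) for h in STATES] + collisions)

loi_from_chain = meet(W_up_to(3), C)
loi_direct = blocks_by(lambda h: trace[h][1], STATES)
print(loi_from_chain == loi_direct, loi_direct)
```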

6.2. Analysis of C-code vulnerabilities. Recent work [HM1], based on
the LoI interpretation of programs, has demonstrated the applicability of
QIF to real-world vulnerabilities. Previous attempts to implement a
quantitative analysis had hit a major hurdle: in very simple terms, since QIF
is based on LoI(P) and LoI(P) is the set-theoretical kernel of the
denotational semantics of P, computing LoI(P) is computationally infeasible.
The approach followed in [HM1] is to change the QIF question from computing
LoI(P) to computing bounds on the channel capacity CC(P). We saw these
concepts are related in theorem 3. Using assume-guarantee reasoning,
questions about bounds can be expressed in verification terms. In particular,
by expressing them as drivers for the symbolic model checker CBMC [CKL],
several CVE-reported vulnerabilities in the Linux kernel were quantitatively
analysed in [HM1]; moreover the official patches for such vulnerabilities were
formally verified as fixing the leak. That work is the first demonstration of
quantitative information flow addressing security concerns of real-world
industrial programs.

7. Conclusions 

We investigated the importance of the Lattice of Information for
Quantitative Information Flow. This lattice allows for an algebraic treatment
of confidentiality and clarifies the relationship between the information
theoretical, probabilistic and guessability measures that are used in QIF. Our
results show that these measures are all consistent w.r.t. the classification
of language-based confidentiality threats, and this classification is captured
by the refinement order in LoI.

We have seen how these results fit and contribute to recent work in the
community, especially that of Yasuoka and Terauchi [YT1] and of Smith
[Sm]. It is a matter for future research to determine whether the Lattice
of Information can also provide a unifying foundation for non-deterministic
and probabilistic systems.